Stop misclassifying unquoted string parameters as keywords#21

Merged
magnesj merged 10 commits into
OPM:mainfrom
magnesj:feature/13-unquoted-string-params
May 19, 2026

Conversation

@magnesj
Member

@magnesj magnesj commented May 19, 2026

Summary

Fixes #13: YES under SCALECRS, THPRES under EQLOPTS, and similar unquoted string values were highlighted, linted, and shown in the docs panel as if they were keyword declarations. OPM Flow itself only recognises keywords in column 1; align grammar, analyzer, and cursor-driven providers with that rule.

Testing the fix surfaced several related real-deck issues. Each is its own commit so changes can be reviewed independently.

What's in the PR

  • String parameters without using quotes are interpreted as keywords #13 — unquoted indented uppercase tokens are record values, not keywords. Grammar anchored to column 1; analyzer treats indented keyword-shaped tokens inside an open record block as record values; cursor-driven docs/hover require the word to begin at column 0.
  • Include-file extensions — register 17 extra extensions seen in opm-tests decks (.eqldims, .regdims, .gridopts, .swatinit, .faults, .trans, .vfpprod, .pvtnum, .rxvd, .grid, .fipzon, .permx, …) so those files load with the opm-flow language id.
  • FIP user-defined regions (FIPZON, FIPGL, FIPNL, FIPUNIT, FIPHC) resolve through the existing templating path against a new FIP base entry.
  • SUMMARY list-of-name mnemonics (WOPR, WGPR, …) — reshape 976 entries from size_kind: fixed/1 to array so multi-line bodies don't get a per-line "missing terminating /" diagnostic.
  • Bare-stacked SUMMARY mnemonics (GMWPR \n GMWIN \n /) — new optional_body flag tagged on 981 non-F mnemonics so empty bodies don't demand a closing /; once values appear the diagnostic still fires.
  • UDQ SUMMARY mnemonics (WUWI1, FUOIL, GUTOT, …) — strip the trailing X's from the manual's FUXXXXXX/WUXXXXXX/… placeholders and mark the 2-char prefixes templated.
  • MESSAGES — single 13-INT record canonically split as print-limits/stop-limits across two lines. Tagged variadic_record: true via a manually-curated set in the build script.
  • VFPPROD / VFPINJ — empirically end with the trailing record's /, no standalone closer. Reclassified from list to fixed plus variadic_record: true so per-line and final-terminator checks are suppressed.
  • Grammar: bare uppercase tokens at column 1 inside records. Tighten the keywords pattern so it only matches a line that is the keyword alone (optionally followed by a -- comment or a single /). Add a low-precedence unquoted-strings (string.unquoted.opm-flow) pattern so well/group/property names like OP01, FIELD, UPPER, SGL/SOWCR in EQUALS/GRUPTREE/WCONHIST records render in the string color rather than being mis-painted as keywords.

Tests

189 jest tests (up from 168), including regression cases for every issue listed above. Grammar changes aren't jest-covered and were spot-checked against the user-reported decks in the Extension Development Host.

magnesj added 10 commits May 19, 2026 08:24
YES, THPRES and other uppercase string values in record lines under
SCALECRS, EQLOPTS, etc. were highlighted, linted, and shown in the docs
panel as if they were keyword declarations. OPM Flow itself only
recognises keywords in column 1; align the editor with that rule:

- Grammar: anchor the keywords rule to column 1.
- Analyzer: treat indented keyword-shaped tokens inside an open record
  block as record values, even when the name is in the index.
- Editor: introduce KEYWORD_LINE_COL1_RE and use it for active-keyword
  scan, docs panel, hover, and folding so clicking THPRES under EQLOPTS
  shows EQLOPTS docs rather than the unrelated SOLUTION THPRES.
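
The column-1 rule for the cursor-driven providers can be sketched roughly as below. This is a minimal TypeScript illustration, not the extension's actual code: only the name `KEYWORD_LINE_COL1_RE` comes from the commit message, while the regex body and the `activeKeyword` helper are assumptions made for the example.

```typescript
// Assumed regex body: an uppercase identifier that starts in column 1 and is
// followed by whitespace, a '/', or end of line.
const KEYWORD_LINE_COL1_RE = /^([A-Z][A-Z0-9_]*)(?=[ \t\/]|$)/;

// Hypothetical helper: walk upward from the cursor line to the nearest line
// that starts with a keyword-shaped token in column 1.
function activeKeyword(lines: string[], cursorLine: number): string | undefined {
  const start = Math.min(cursorLine, lines.length - 1);
  for (let i = start; i >= 0; i--) {
    const match = KEYWORD_LINE_COL1_RE.exec(lines[i] ?? "");
    if (match) {
      return match[1];
    }
  }
  return undefined;
}

// Cursor on the indented THPRES value: the enclosing keyword is EQLOPTS.
const deck = ["EQLOPTS", "  THPRES /"];
console.log(activeKeyword(deck, 1)); // prints "EQLOPTS"
```

A real implementation would presumably also check the matched token against the keyword index, since a record line can itself start with an uppercase name in column 1.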
Adds 17 extensions used by opm-tests decks (eqldims, regdims, gridopts,
swatinit, faults, fipzon, permx, pvtnum, rxvd, trans, vfpprod, grid,
prpecl, dat, incl, sched, smry) so these files load with the opm-flow
language id and pick up syntax highlighting, diagnostics, hover, etc.

The OPM manual states a FIP keyword name is "FIP as the first three
characters followed by up to a five letter character string", producing
deck tokens like FIPZON, FIPGL, FIPNL, FIPUNIT, FIPHC. Mark FIP as a
templated base name so the existing FIP+[A-Z0-9]+ fallback resolves
those tokens to the FIP entry. Direct entries (FIPNUM, FIPOWG, FIPSEP,
FIP_PROBE) still win via the direct-lookup path.

W/G/C/R/B/A/N/S-prefixed SUMMARY mnemonics take a list of names closed
by a single '/'; names may sit inline or be spread across many lines.
The previous fixed/1 classification mis-flagged each intermediate name
line as missing the terminating '/'. Reclassify these 976 mnemonics as
size_kind 'array' so per-line terminator checks are skipped but a
missing block-end '/' is still flagged.

Non-F SUMMARY mnemonics may legally appear without a body. Bare lines
stacked back-to-back and closed by a single trailing '/' are a common
real-deck pattern:

    GMWPR
    GMWIN
    /

Add an `optional_body` flag on AnalysisEntry and tag all 981 non-F
SUMMARY array mnemonics with it. The close-block terminator check now
skips entries whose body was empty (recordCount === 0); once names are
listed, the missing-'/' diagnostic still fires.
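
The interplay between `size_kind` and `optional_body` can be illustrated with a small TypeScript sketch. It is a simplification under stated assumptions: the `AnalysisEntry` shape is reduced to the two fields named in the commit messages, and `stripLine` and `checkBlock` are hypothetical helpers, not the analyzer's real functions.

```typescript
// Simplified, assumed view of an analysis entry.
interface AnalysisEntry {
  size_kind: "fixed" | "array";
  optional_body?: boolean;
}

// Drop a trailing "--" comment and surrounding whitespace. Simplification:
// a "--" inside a quoted string is not handled.
function stripLine(line: string): string {
  return line.replace(/--.*$/, "").trim();
}

// Return the terminator diagnostics for the body lines that follow one keyword.
function checkBlock(entry: AnalysisEntry, bodyLines: string[]): string[] {
  const diagnostics: string[] = [];
  const lines = bodyLines.map(stripLine).filter((l) => l.length > 0);
  // Lines that carry at least one value, i.e. not just a lone "/".
  const recordCount = lines.filter((l) => l !== "/").length;

  if (entry.size_kind === "fixed") {
    // Fixed records: every non-empty body line must end with '/'.
    lines.forEach((l, i) => {
      if (!/\/$/.test(l)) {
        diagnostics.push(`body line ${i + 1}: missing terminating /`);
      }
    });
    return diagnostics;
  }

  // Array: only the block end needs a '/'. A bare mnemonic (optional_body
  // and no values) is allowed to have no '/' at all.
  const last = lines[lines.length - 1] ?? "";
  const closed = /\/$/.test(last);
  const bareMnemonic = entry.optional_body === true && recordCount === 0;
  if (!closed && !bareMnemonic) {
    diagnostics.push("missing block-end /");
  }
  return diagnostics;
}

const mnemonic: AnalysisEntry = { size_kind: "array", optional_body: true };
console.log(checkBlock(mnemonic, []));                       // bare mnemonic
console.log(checkBlock(mnemonic, ["  'OP01'", "  'OP02'"])); // names, no '/'
```

In this sketch the first call yields no diagnostics, while the second reports the missing block-end '/', matching the behaviour described above.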
UDQ SUMMARY variables are user-named: the manual documents them as
placeholders FUXXXXXX, WUXXXXXX, GUXXXXXX, CUXXXXXX, RUXXXXXX,
SUXXXXXX where the trailing X's stand for the user-defined name (up
to six characters). Real decks write tokens like WUWI1, FUOIL, GUTOT.

The 8-char placeholders sat in the index but never resolved: the
template fallback requires the base name to be shorter than the deck
token. Strip the trailing X's so each entry is keyed by its 2-char
scope prefix (FU, WU, GU, CU, RU, SU) and mark them templated. The
existing <base>+[A-Z0-9]+ fallback then resolves WUWI1, FUOIL, etc.
to the correct shape. Direct lookups for FULLIMP, GUIDECAL, RUNSPEC,
SURFACT, ... still win because direct beats template.
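
The "direct beats template" resolution order can be sketched as follows. This is an illustrative TypeScript approximation, not the extension's lookup code: `IndexEntry`, `resolveKeyword`, and the longest-base tie-break are assumptions made for the example.

```typescript
// Assumed minimal index entry; keys of the index are keyword base names.
interface IndexEntry {
  name: string;
  templated?: boolean;
}

function resolveKeyword(
  token: string,
  index: Record<string, IndexEntry>
): IndexEntry | undefined {
  // 1. Direct entries always win.
  if (Object.prototype.hasOwnProperty.call(index, token)) {
    return index[token];
  }
  // 2. Template fallback: <base> + [A-Z0-9]+, where the base must be
  //    strictly shorter than the deck token.
  let best: IndexEntry | undefined;
  let bestLength = 0;
  for (const base of Object.keys(index)) {
    const entry = index[base];
    if (!entry || !entry.templated) continue;
    if (token.length <= base.length) continue;
    if (token.slice(0, base.length) !== base) continue;
    if (!/^[A-Z0-9]+$/.test(token.slice(base.length))) continue;
    // Assumption: when several bases match, prefer the longest one.
    if (base.length > bestLength) {
      best = entry;
      bestLength = base.length;
    }
  }
  return best;
}

const index: Record<string, IndexEntry> = {
  FU: { name: "FU", templated: true },
  WU: { name: "WU", templated: true },
  FIP: { name: "FIP", templated: true },
  FULLIMP: { name: "FULLIMP" },
  FIPNUM: { name: "FIPNUM" },
};

console.log(resolveKeyword("FUOIL", index)?.name);   // prints "FU"
console.log(resolveKeyword("FULLIMP", index)?.name); // prints "FULLIMP"
console.log(resolveKeyword("FIPZON", index)?.name);  // prints "FIP"
```

The 8-character placeholders failed under this scheme because a base such as FUXXXXXX is never shorter than a real deck token like FUOIL, so the length check rejects it.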
MESSAGES is a single record of 13 INT parameters that real decks
routinely split across two lines — print limits then stop limits,
closed by a single trailing '/':

    MESSAGES
      80000 10000 5000000 5000  300   1
      80000 10000 5000000 80000  10   1 /

opm-common doesn't flag any MESSAGES item as size_type: "ALL", so the
existing variadic-record auto-detection never tagged it. Add a manually
curated VARIADIC_RECORD_KEYWORDS set in the build script and a matching
'variadic_record: true' on the MESSAGES index entry so per-line missing
'/' diagnostics are suppressed.

opm-common classifies VFPPROD and VFPINJ as 'list' (size = string
sentinel), which triggered (a) per-line "missing terminating '/'"
diagnostics on every line of the multi-line LIQ/THP/WFR/GFR/ALQ/BHP
tables and (b) a closeKw "missing terminating '/' to close the record
list" because real decks never close the block with a standalone '/'
— the next keyword does. Add the keywords to VARIADIC_RECORD_KEYWORDS
(suppress per-line check) and a new NO_LIST_TERMINATOR_KEYWORDS set
that reclassifies them to 'fixed' with size_count = records_meta
length (suppress final terminator check). Both axis-table and
BHP-table forms now parse cleanly.
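
The two override sets can be pictured with the sketch below. It is written in TypeScript purely for illustration; the actual build script may be in a different language and structured differently. The set names come from the commit messages, only the members named there are shown, and `RawEntry` and `applyOverrides` are hypothetical.

```typescript
// Manually curated sets (only the members named in the commit messages).
const VARIADIC_RECORD_KEYWORDS = ["MESSAGES", "VFPPROD", "VFPINJ"];
const NO_LIST_TERMINATOR_KEYWORDS = ["VFPPROD", "VFPINJ"];

// Assumed, simplified shape of an index entry before overrides are applied.
interface RawEntry {
  name: string;
  size_kind: "fixed" | "list" | "array";
  size_count?: number;
  records_meta: object[];
  variadic_record?: boolean;
}

function applyOverrides(entry: RawEntry): RawEntry {
  const result: RawEntry = { ...entry };
  // Suppress the per-line "missing terminating /" check.
  if (VARIADIC_RECORD_KEYWORDS.indexOf(entry.name) !== -1) {
    result.variadic_record = true;
  }
  // Suppress the final list-terminator check: reclassify list -> fixed,
  // with one expected record per records_meta element.
  if (NO_LIST_TERMINATOR_KEYWORDS.indexOf(entry.name) !== -1) {
    result.size_kind = "fixed";
    result.size_count = entry.records_meta.length;
  }
  return result;
}

// Example input with a made-up records_meta length of 3.
const vfpprod = applyOverrides({
  name: "VFPPROD",
  size_kind: "list",
  records_meta: [{}, {}, {}],
});
console.log(vfpprod.size_kind, vfpprod.size_count, vfpprod.variadic_record);
```

Keeping the two concerns in separate sets lets MESSAGES opt into the per-line suppression without being reclassified.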
keyword line pattern

Tighten the keyword rule so it only fires when the line is the
keyword alone, optionally followed by a '--' comment or a '/' (this
keeps inline 'KEYWORD /' colored). Record lines starting with an
uppercase identifier at column 1 — well/group names like OP01,
property names like SGL/SGCR/SOWCR in EQUALS, group names like
UPPER/FIELD in GRUPTREE, enum values like OPEN/ORAT in WCONHIST —
no longer pick up the keyword color.

Add a low-precedence 'unquoted-strings' pattern
(string.unquoted.opm-flow) that catches any remaining bare uppercase
identifier, so those same record-line tokens now render in the string
color instead of
falling through unstyled. Quoted strings, variables, defaults,
numbers, and terminators keep their existing scopes because they
match earlier in the precedence list.
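
The effect of the tightened keyword rule plus the unquoted-string fallback can be approximated as below. The real grammar is a TextMate grammar with its own regex engine, so this TypeScript sketch with JavaScript regular expressions is only an illustration: both patterns, the `"keyword"` label, and `scopeOfFirstToken` are assumptions. Only the scope name `string.unquoted.opm-flow` is taken from the commit message.

```typescript
// Assumed pattern: the line is the keyword alone, optionally followed by a
// single '/' and/or a '--' comment.
const KEYWORD_ONLY_LINE = /^[A-Z][A-Z0-9_]*[ \t]*(?:\/[ \t]*)?(?:--.*)?$/;

// Assumed low-precedence pattern: a bare uppercase identifier as the first
// token of the line, at any indentation.
const UNQUOTED_IDENTIFIER = /^[ \t]*[A-Z][A-Z0-9_]*(?![A-Za-z0-9_])/;

type Scope = "keyword" | "string.unquoted.opm-flow" | "other";

// Classify the first token of a line; the keyword rule is tried first,
// mirroring its higher precedence in the grammar.
function scopeOfFirstToken(line: string): Scope {
  if (KEYWORD_ONLY_LINE.test(line)) return "keyword";
  if (UNQUOTED_IDENTIFIER.test(line)) return "string.unquoted.opm-flow";
  return "other";
}

console.log(scopeOfFirstToken("WCONHIST"));             // prints "keyword"
console.log(scopeOfFirstToken("OP01 OPEN ORAT 100 /")); // prints "string.unquoted.opm-flow"
console.log(scopeOfFirstToken("  THPRES /"));           // prints "string.unquoted.opm-flow"
```

Note that in this sketch a column-1 record line holding only a single name and a '/' would still match the keyword pattern, because it is indistinguishable from an inline `KEYWORD /` line.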
Six pytest cases still asserted the old 'fixed/size_count=1' shape
for non-F SUMMARY mnemonics. Update them to the new 'array' +
optional_body=True shape introduced when WOPR/GMWIN multi-line and
bare-stack bodies were fixed, and import the new _summary_optional_body
helper.
@magnesj magnesj merged commit 7f51ea6 into OPM:main May 19, 2026
4 checks passed

Development

Successfully merging this pull request may close these issues.

String parameters without using quotes are interpreted as keywords

1 participant